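Lying awake deep in the night, I can't help asking myself: do I regret wasting yesterday, am I unwilling to face today, or do I just wish time would pause at moments like these and wait for my lost self? (山河已无恙)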
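Preface
Today I'd like to share my notes on the kubelet component in K8s.
The notes cover how kubelet works, including:
node management and the kubelet service on a node
kubelet Pod management
container health checks
If I have misunderstood anything, corrections are welcome.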
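How kubelet works
In a Kubernetes cluster, a kubelet service process runs on every Node (formerly also called a Minion). The process handles the tasks that the Master sends down to its node (that is, the Pods the scheduler has placed on this node) and manages Pods and the containers inside them. You can think of kubelet as the Node's fully empowered agent within the K8s cluster.
Each kubelet process registers its node's information with the API Server (the kube-apiserver service on the master node) and periodically reports the node's resource usage to the Master. It also collects container and node resource statistics through its embedded cAdvisor and exposes them for Metrics Server to scrape.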
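Node management
A Node decides whether to register itself with the API Server on the master through the kubelet configuration parameter registerNode. If the value is true, kubelet tries to register itself through the API Server. The default is true, and the value is set in the configuration file that the kubelet startup flag --config points to.
Let's look at the status of the kubelet service on a Node: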
```
┌──[root@vms82.liruilongs.github.io]-[~]
└─$systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sun 2022-06-12 14:58:48 CST; 1 months 1 days ago
     Docs: https://kubernetes.io/docs/
 Main PID: 970 (kubelet)
   Memory: 194.4M
   CGroup: /system.slice/kubelet.service
           └─970 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/kubelet.d --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.5
```
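From the service status we can see:
Unit description: kubelet: The Kubernetes Node Agent, i.e. this is the Kubernetes node agent service.
Unit file state: loaded, meaning the unit file has been loaded.
Drop-in configuration file: Drop-In: /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
Unit file location: /usr/lib/systemd/system/kubelet.service. This is the lowest-priority location. Unit file directories take priority in this order, highest first:
Local configuration: /etc/systemd/system/
Runtime configuration: /run/systemd/system/
Configuration installed by packages: /usr/lib/systemd/system/
The service is set to start at boot: enabled
The vendor preset is not to start at boot: vendor preset: disabled
The service is currently running, shown with its start time and duration: Active: active (running) since Sun 2022-06-12 14:58:48 CST; 1 months 1 days ago
Documentation: Docs: https://kubernetes.io/docs/
Process PID: Main PID: 970 (kubelet)
Memory used: Memory: 194.4M
The slice and control group the service belongs to: CGroup: /system.slice/kubelet.service
With that Cgroup, we can watch the process's task count, CPU, memory, and IO as they change: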
```
┌──[root@vms82.liruilongs.github.io]-[~]
└─$watch -n 3 -d 'systemd-cgtop | grep /system.slice/kubelet.service'
Every 3.0s: systemd-cgtop | grep /system.slice/kubelet.service                Thu Jul 14 23:53:40 2022

/system.slice/kubelet.service                 1      -   194.7M        -        -
```
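The memory figure that systemd-cgtop prints can also be read straight from the cgroup filesystem. The following is a minimal sketch, assuming a cgroup v1 host where the memory controller is mounted under /sys/fs/cgroup/memory; that path and the helper names read_memory_usage and human_readable are illustrative assumptions, not something taken from the session above.

```python
from pathlib import Path

# Assumed cgroup v1 layout; on cgroup v2 hosts the file is named memory.current.
KUBELET_CGROUP = Path("/sys/fs/cgroup/memory/system.slice/kubelet.service")


def read_memory_usage(cgroup_dir: Path) -> int:
    """Return the cgroup's current memory usage in bytes (cgroup v1 file)."""
    return int((cgroup_dir / "memory.usage_in_bytes").read_text().strip())


def human_readable(num_bytes: int) -> str:
    """Format a byte count with one decimal and a B/K/M/G suffix."""
    value = float(num_bytes)
    for unit in ("B", "K", "M", "G"):
        if value < 1024 or unit == "G":
            return f"{value:.1f}{unit}"
        value /= 1024


if __name__ == "__main__":
    # Only prints something on a host where the assumed path exists.
    if KUBELET_CGROUP.exists():
        print(human_readable(read_memory_usage(KUBELET_CGROUP)))
```

Passing the directory in as a parameter keeps the reader usable for any other unit's cgroup, not just kubelet.service.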
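Besides the unit file kubelet.service, the Service above also has a Drop-In directory, /usr/lib/systemd/system/kubelet.service.d. Every file in that directory whose name ends in ".conf" is parsed after the unit file itself, so values from the kubelet.service unit file may be overridden.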
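How a drop-in overrides the unit file can be sketched as a small merge routine. This is a simplified model written for illustration, not systemd's parser: it only looks at the [Service] section, treats ExecStart, Environment, and EnvironmentFile as list-type settings that an empty assignment clears, and lets the last value win for everything else. The function name merge_service_section is hypothetical.

```python
def merge_service_section(fragments):
    """Merge [Service] settings from a unit file followed by its drop-ins.

    fragments: list of file contents, unit file first, then drop-ins in order.
    List-type keys accumulate and are cleared by an empty assignment;
    every other key keeps the last value seen.
    """
    list_keys = {"ExecStart", "Environment", "EnvironmentFile"}
    merged = {}
    for text in fragments:
        in_service = False
        for raw in text.splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            if line.startswith("["):
                in_service = line == "[Service]"
                continue
            if not in_service or "=" not in line:
                continue
            key, value = (part.strip() for part in line.split("=", 1))
            if key in list_keys:
                if value == "":
                    merged[key] = []  # empty assignment resets the list
                else:
                    merged.setdefault(key, []).append(value)
            else:
                merged[key] = value
    return merged


if __name__ == "__main__":
    unit = "[Service]\nExecStart=/usr/bin/kubelet\nRestart=always\n"
    dropin = (
        "[Service]\n"
        "ExecStart=\n"
        "ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS\n"
    )
    print(merge_service_section([unit, dropin]))
```

The empty ExecStart= line is what lets a drop-in replace the start command instead of adding a second one.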
systemctl cat kubelet.service prints the unit file together with all of its drop-in files:
```
┌──[root@vms82.liruilongs.github.io]-[/usr/lib/systemd/system/kubelet.service.d]
└─$systemctl cat kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=https://kubernetes.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/bin/kubelet
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --pod-manifest-path=/etc/kubernetes/kube
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
```
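A Target unit, as seen here, is what systemd uses to emulate "runlevels". Its file extension is .target, and you can think of a Target as a group of Units that bundles related units such as Services, Sockets, and Devices.
On Red Hat family distributions, starting with CentOS 7, loading a target replaces the old runlevels. Two common targets are multi-user.target and graphical.target, which correspond to the former runlevel 3 (text mode with networking, including NFS) and runlevel 5 (graphical mode).
Show the default target, which plays the role of the default runlevel: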
```
┌──[root@vms82.liruilongs.github.io]-[/etc/kubernetes/manifests]
└─$systemctl get-default
multi-user.target
```
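Newer kubelet versions have deprecated most command-line flags and keep only a small set. Most settings are now made in the configuration file passed through the --config flag. The drop-in /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf assembles the command line the process is finally started with. The resulting startup command and its flags: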
```
/usr/bin/kubelet \
  --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf \
  --kubeconfig=/etc/kubernetes/kubelet.conf \
  --pod-manifest-path=/etc/kubernetes/kubelet.d \
  --config=/var/lib/kubelet/config.yaml \
  --network-plugin=cni \
  --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.5
```
systemctl show kubelet.service lists all properties of the unit:
```
┌──[root@vms82.liruilongs.github.io]-[/etc/kubernetes/manifests]
└─$systemctl show kubelet.service
Type=simple
Restart=always
NotifyAccess=none
RestartUSec=10s
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
WatchdogUSec=0
WatchdogTimestamp=日 2022-06-12 14:58:48 CST
WatchdogTimestampMonotonic=21325336
StartLimitInterval=0
StartLimitBurst=5
StartLimitAction=none
FailureAction=none
......
```
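List the forward dependencies of kubelet.service, that is, the units kubelet.service requires or wants. Whether a unit actually starts before kubelet is decided by ordering directives such as After=; in the unit file, kubelet.service is ordered after network-online.target.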
```
┌──[root@vms82.liruilongs.github.io]-[~]
└─$systemctl list-dependencies kubelet.service
kubelet.service
● ├─system.slice
● ├─basic.target
● │ ├─microcode.service
● │ ├─rhel-autorelabel-mark.service
● │ ├─rhel-autorelabel.service
● │ ├─rhel-configure.service
● │ ├─rhel-dmesg.service
● │ ├─rhel-loadmodules.service
● │ ├─selinux-policy-migrate-local-changes@targeted.service
● │ ├─paths.target
● │ ├─slices.target
● │ │ ├─-.slice
● │ │ └─system.slice
● │ ├─sockets.target
● │ │ ├─dbus.socket
● │ │ ├─rpcbind.socket
● │ │ ├─systemd-initctl.socket
● │ │ ├─systemd-journald.socket
● │ │ ├─systemd-shutdownd.socket
● │ │ ├─systemd-udevd-control.socket
● │ │ └─systemd-udevd-kernel.socket
● │ ├─sysinit.target
● │ │ ├─dev-hugepages.mount
● │ │ ├─dev-mqueue.mount
● │ │ ├─kmod-static-nodes.service
● │ │ ├─plymouth-read-write.service
● │ │ ├─plymouth-start.service
● │ │ ├─proc-sys-fs-binfmt_misc.automount
● │ │ ├─sys-fs-fuse-connections.mount
● │ │ ├─sys-kernel-config.mount
● │ │ ├─sys-kernel-debug.mount
● │ │ ├─systemd-ask-password-console.path
● │ │ ├─systemd-binfmt.service
● │ │ ├─systemd-firstboot.service
● │ │ ├─systemd-hwdb-update.service
● │ │ ├─systemd-journal-catalog-update.service
● │ │ ├─systemd-journal-flush.service
● │ │ ├─systemd-journald.service
● │ │ ├─systemd-machine-id-commit.service
● │ │ ├─systemd-modules-load.service
● │ │ ├─systemd-random-seed.service
● │ │ ├─systemd-sysctl.service
● │ │ ├─systemd-tmpfiles-setup-dev.service
● │ │ ├─systemd-tmpfiles-setup.service
● │ │ ├─systemd-udev-trigger.service
● │ │ ├─systemd-udevd.service
● │ │ ├─systemd-update-done.service
● │ │ ├─systemd-update-utmp.service
● │ │ ├─systemd-vconsole-setup.service
● │ │ ├─cryptsetup.target
● │ │ ├─local-fs.target
● │ │ │ ├─-.mount
● │ │ │ ├─rhel-import-state.service
● │ │ │ ├─rhel-readonly.service
● │ │ │ └─systemd-remount-fs.service
● │ │ └─swap.target
● │ └─timers.target
● │   └─systemd-tmpfiles-clean.timer
● └─network-online.target
●   └─NetworkManager-wait-online.service
```
List the reverse dependencies of kubelet.service, that is, the units that require or want kubelet.service:
```
┌──[root@vms82.liruilongs.github.io]-[~]
└─$systemctl list-dependencies kubelet.service --reverse
kubelet.service
● └─multi-user.target
●   └─graphical.target
┌──[root@vms82.liruilongs.github.io]-[~]
└─$systemctl list-dependencies graphical.target | grep kube
● ├─kubelet.service
```
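--kubeconfig=/etc/kubernetes/kubelet.conf
The authentication file. kubelet.conf is a kubeconfig file that organizes information about clusters, users, namespaces, and authentication mechanisms. Just as the kubectl command-line tool uses a kubeconfig file to find the information it needs to select a cluster, kubelet uses this file to communicate with the cluster's API server.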
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 ┌──[root@vms82.liruilongs.github.io]-[/etc/kubernetes/manifests] └─$cat /etc/kubernetes/kubelet.conf apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeE1USXhNakUyTURBME1sb1hEVE14TVRJeE1ERTJNREEwTWxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTkdkCisrWnhFRDJRQlR2Rm5ycDRLNFBrd2lsYXUrNjdXNTVobVdwc09KSHF6ckVoWUREY3l4ZTU2Z1VJVDFCUTFwbU0KcGFrM0V4L0JZRStPeHY4ZmxtellGbzRObDZXQjl4VXovTW5HQi96dHZsTGpaVEVHZy9SVlNIZTJweCs2MUlSMQo2Mkh2OEpJbkNDUFhXN0pmR3VXNDdKTXFUNTUrZUNuR00vMCtGdnI2QUJnT2YwNjBSSFFuaVlzeGtpSVJmcjExClVmcnlPK0RFTGJmWjFWeDhnbi9tcGZEZ044cFgrVk9FNFdHSDVLejMyNDJtWGJnL3A0emd3N2NSalpSWUtnVlUKK2VNeVIyK3pwaTBhWW95L2hLYmg4RGRUZ3FZeERDMzR6NHFoQ3RGQnVia1hmb3Ftc3FGNXpQUm1ZS051RUgzVAo2c1FNSFl4emZXRkZvSGQ2Y0JNQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZHRGNLU3V1VjVNNXlaTkJHUDEvNmg3TFk3K2VNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBRVE0SUJhM0hBTFB4OUVGWnoyZQpoSXZkcmw1U0xlanppMzkraTdheC8xb01SUGZacElwTzZ2dWlVdHExVTQ2V0RscTd4TlFhbVVQSFJSY1RrZHZhCkxkUzM5Y1UrVzk5K3lDdXdqL1ZrdzdZUkpIY0p1WCtxT1NTcGVzb3lrOU16NmZxNytJUU9lcVRTbGpWWDJDS2sKUFZxd3FVUFNNbHFNOURMa0JmNzZXYVlyWUxCc01EdzNRZ3N1VTdMWmg5bE5TYVduSzFoR0JKTnRndjAxdS9MWAo0TnhKY3pFbzBOZGF1OEJSdUlMZHR1dTFDdEFhT21CQ2ZjeTBoZHkzVTdnQXh5blR6YU1zSFFTamIza0JDMkY5CkpWSnJNN1FULytoMStsOFhJQ3ZLVzlNM1FlR0diYm13Z1lLYnMvekswWmc1TE5sLzFJVThaTUpPREhTVVBlckQKU09ZPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://192.168.26.81:6443 name: default-cluster contexts: - context: cluster: default-cluster namespace: default user: default-auth name: default-context current-context: default-context kind: Config preferences: {} users: - name: default-auth user: client-certificate: 
/var/lib/kubelet/pki/kubelet-client-current.pem client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
For how this authentication file is generated, see my earlier article:
Notes on token and kubeconfig file authentication with the Kubernetes API Server https://liruilong.blog.csdn.net/article/details/122694838
--config=/var/lib/kubelet/config.yaml
The kubelet configuration file that holds the startup parameters:
```
┌──[root@vms82.liruilongs.github.io]-[~]
└─$cat /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
```
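All of these parameters are documented in detail on the official website; look them up there when you need them.
If the cluster runs short of resources while it is running, you can scale out by adding machines and relying on kubelet's self-registration mode. In some cases a kubelet in the cluster does not use self-registration; the user then has to configure the Node's resource information and tell the kubelet on that Node where the API Server is. In general, if a mature installation tool such as kubeadm is available, using the tool is more convenient.
Cluster administrators can create and modify node information. An administrator who wants to create the node information manually can do so by setting the kubelet configuration parameter registerNode to false.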
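At startup kubelet registers the node information through the API Server and then periodically sends the node's latest status to the API Server, which writes what it receives into etcd.
The kubelet configuration parameter nodeStatusUpdateFrequency sets how often kubelet reports node status to the API Server; the default is 10s. Note: change this value with care, because it must be kept consistent with nodeMonitorGracePeriod in the node controller.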
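The relationship between the two settings can be checked with simple arithmetic: the grace period should leave room for several status updates. The sketch below assumes a grace period of 40s, a commonly seen default for the controller manager that is not stated in this post, and the function name status_updates_within_grace is hypothetical.

```python
def status_updates_within_grace(node_monitor_grace_period_s: float,
                                node_status_update_frequency_s: float) -> int:
    """Number of whole status-update intervals that fit into the grace period."""
    if node_status_update_frequency_s <= 0:
        raise ValueError("update frequency must be positive")
    return int(node_monitor_grace_period_s // node_status_update_frequency_s)


if __name__ == "__main__":
    # Assumed values: 40s grace period (assumption), 10s update frequency (from the text).
    print(status_updates_within_grace(40, 10))
```

If the result drops to 0 or 1, a single delayed report is enough for the node controller to consider the node unhealthy, which is why the two values should be changed together.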
Use journalctl -u kubelet.service to view the logs. The kubelet service sometimes dies, and journalctl is the tool for tracking down the problem:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 ┌──[root@vms82.liruilongs.github.io]-[~] └─$ journalctl -u kubelet.service -- Logs begin at 日 2022-06-12 14:58:36 CST, end at 四 2022-07-14 22:16:38 CST. -- 6月 12 14:58:48 vms82.liruilongs.github.io systemd[1]: Started kubelet: The Kubernetes Node Agent. 6月 12 14:58:48 vms82.liruilongs.github.io systemd[1]: Starting kubelet: The Kubernetes Node Agent... 6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information. 6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: Flag --network-plugin has been deprecated, will be removed along with dockershim. 6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: Flag --pod-manifest-path has been deprecated, This parameter should be set via the config file specified by the Kubelet' s --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: Flag --network-plugin has been deprecated, will be removed along with dockershim. 6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:49.396358 970 server.go:440] "Kubelet version" kubeletVersion="v1.22.2" 6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:49.397298 970 server.go:868] "Client rotation is on, will bootstrap in background" 6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:49.409027 970 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem" . 
6月 12 14:58:49 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:49.422834 970 dynamic_cafile_content.go:155] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.834139 970 server.go:687] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.836548 970 container_manager_linux.go:280] "Container manager verified user specified cgroup-root exists" cgroupRoot=[] 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.836913 970 container_manager_linux.go:285] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none} 6月 12 14:58:50 
vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.838174 970 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.838223 970 container_manager_linux.go:320] "Creating device plugin manager" devicePluginEnabled=true 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.838441 970 state_mem.go:36] "Initialized new in-memory state store" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.839147 970 kubelet.go:314] "Using dockershim is deprecated, please consider using a full-fledged CRI implementation" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.840462 970 client.go:78] "Connecting to docker on the dockerEndpoint" endpoint="unix:///var/run/docker.sock" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.840495 970 client.go:97] "Start docker client with request timeout" timeout="2m0s" 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.851651 970 docker_service.go:566] "Hairpin mode is set but kubenet is not enabled, falling back to HairpinVeth" hairpinMode=promiscuous-bridge 6月 12 14:58:50 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:50.852141 970 docker_service.go:242] "Hairpin mode is set" hairpinMode=hairpin-veth 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.041739 970 docker_service.go:257] "Docker cri networking managed by the network plugin" networkPluginName="cni" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.056895 970 docker_service.go:264] "Docker Info" dockerInfo=&{ID:IBJD:6MIX:4FUA:Z6W3:UIL2:VGXR:K7PS:PN3X:BVBO:5MKQ:D3WY:6JJY Containers:37 ContainersRunning:0 ContainersPaused:0 ContainersStopped:37 Images:56 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true ] [Native Overlay Diff true ] [userxattr false ]] 
SystemStatus:[] Plugins:{Volume:[local ] Network:[bridge host ipvlan macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true KernelMemoryTCP:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:26 OomKillDisable:true NGoroutines:37 SystemTime:2022-06-12T14:58:51.042314205+08:00 LoggingDriver:json-file CgroupDriver:systemd CgroupVersion:1 NEventsListener:0 KernelVersion:3.10.0-693.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSVersion:7 OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc0007a4230 NCPU:3 MemTotal:5104164864 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:vms82.liruilongs.github.io Labels:[] ExperimentalBuild:false ServerVersion:20.10.9 ClusterStore: ClusterAdvertise: Runtimes:map[io.containerd.runc.v2:{Path:runc Args:[] Shim:<nil>} io.containerd.runtime.v1.linux:{Path:runc Args:[] Shim:<nil>} runc:{Path:runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:5b46e404f6b9f661a205e28d59c982d3634148f8 Expected:5b46e404f6b9f661a205e28d59c982d3634148f8} RuncCommit:{ID:v1.0.2-0-g52b36a2 Expected:v1.0.2-0-g52b36a2} InitCommit:{ID:de40ad0 Expected:de40ad0} SecurityOptions:[name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: Warnings:[]} 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.056930 970 docker_service.go:277] "Setting cgroupDriver" cgroupDriver="systemd" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.098194 970 
kubelet.go:418] "Attempting to sync node with API server" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.098254 970 kubelet.go:279] "Adding static pod path" path="/etc/kubernetes/kubelet.d" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.098886 970 kubelet.go:290] "Adding apiserver pod source" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.099026 970 apiserver.go:42] "Waiting for node sync before watching apiserver pods" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.113064 970 kuberuntime_manager.go:244] "Container runtime initialized" containerRuntime="docker" version="20.10.9" apiVersion="1.41.0" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.117575 970 server.go:1213] "Started kubelet" 6月 12 14:58:51 vms82.liruilongs.github.io kubelet[970]: I0612 14:58:51.118010 970 server.go:149] "Starting to listen" address="0.0.0.0" port=10250
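Pod management
kubelet works from PodSpecs. A PodSpec is a YAML or JSON object that describes a pod. kubelet takes a set of PodSpecs supplied through various mechanisms (mainly the apiserver) and ensures that the containers described in those PodSpecs are running and healthy. kubelet does not manage containers that were not created by Kubernetes.
kubelet obtains the list of Pods to run on its own Node in the following ways.
File: manifest files in the static Pod directory, which is set by staticPodPath in the configuration file that --config points to (or by the deprecated --pod-manifest-path flag). In the config.yaml shown earlier the directory is /etc/kubernetes/manifests/. fileCheckFrequency sets the interval for checking this directory; the default is 20s.
HTTP endpoint (URL): set with the --manifest-url flag. --http-check-frequency sets the interval for checking the data at this HTTP endpoint; the default is 20s.
API Server: kubelet watches the API Server, which persists its data in etcd, and keeps its Pod list in sync.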
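The file source boils down to rescanning a directory on a timer and reacting to what changed. The following is a minimal sketch of that idea, not kubelet's implementation: it skips files whose names start with a dot and does not descend into subdirectories, and the helper names scan_manifests and diff_scans are made up for illustration.

```python
import os
import tempfile


def scan_manifests(directory):
    """Return {filename: mtime_ns} for manifest files, skipping dotfiles and subdirectories."""
    found = {}
    for entry in os.scandir(directory):
        if entry.name.startswith(".") or not entry.is_file():
            continue
        found[entry.name] = entry.stat().st_mtime_ns
    return found


def diff_scans(previous, current):
    """Classify changes between two scans as (added, removed, modified) name lists."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    modified = sorted(name for name in set(previous) & set(current)
                      if previous[name] != current[name])
    return added, removed, modified


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as manifests:
        before = scan_manifests(manifests)
        with open(os.path.join(manifests, "web.yaml"), "w") as handle:
            handle.write("kind: Pod\n")
        after = scan_manifests(manifests)
        print(diff_scans(before, after))
```

Calling scan_manifests once per fileCheckFrequency interval and feeding consecutive results to diff_scans gives the create, delete, and update events a static Pod source needs.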
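Every Pod that is not created through the API Server is called a Static Pod. kubelet reports the Static Pod's status to the API Server and creates a matching Mirror Pod there for it. The Mirror Pod's status faithfully reflects the Static Pod's status. When a Static Pod is deleted, its corresponding Mirror Pod is deleted as well.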
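Two conventions make Mirror Pods easy to recognize from the API side: the Pod name is the static Pod's name with the node name appended after a hyphen, and the object carries the kubernetes.io/config.mirror annotation. The sketch below works on plain dictionaries; the helper names and the lower-casing of the node name are assumptions made for illustration.

```python
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"


def mirror_pod_name(static_pod_name: str, node_name: str) -> str:
    """Name under which a static Pod shows up on the API Server."""
    return f"{static_pod_name}-{node_name.lower()}"


def is_mirror_pod(pod: dict) -> bool:
    """A Pod object carrying the mirror annotation stands in for a static Pod."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return MIRROR_ANNOTATION in annotations


if __name__ == "__main__":
    print(mirror_pod_name("kube-apiserver", "vms81.liruilongs.github.io"))
```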
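Through the API Server client, kubelet uses Watch plus List to monitor the "/registry/nodes/$" entry for the current node's name and the "registry/pods" directory, and syncs what it gets into a local cache. kubelet does not talk to etcd directly: these are etcd key prefixes that the API Server exposes as resources, so every operation on a Pod reaches kubelet through its watch. When it finds a new Pod bound to this node, it creates the Pod as the Pod manifest requires.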
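kubelet reads the information it has watched. If the task is to create or modify a Pod, it proceeds as follows.
Create a data directory for the Pod.
Read the Pod manifest from the API Server.
Mount external volumes (External Volume) for the Pod.
Download the Secrets the Pod uses.
Check the Pods already running on the node. If a Pod has no containers, or its Pause container (the container created from the "kubernetes/pause" image) is not running, first stop the processes of all containers in the Pod. If the Pod contains containers that need to be deleted, delete them.
Create one container per Pod from the "kubernetes/pause" image. This Pause container takes over the networking of all other containers in the Pod. Whenever a new Pod is created, kubelet first creates the Pause container and then the other containers. The "kubernetes/pause" image is roughly 200KB, a very small container image.
For each container in the Pod, do the following.
Compute a hash for the container, then look up the hash of the corresponding Docker container by the container's name. If the container is found and the two hashes differ, stop the container's process in Docker along with the process of its associated Pause container; if they are the same, do nothing.
If the container has terminated and no restartPolicy (restart policy) is specified for it, do nothing.
Call the Docker Client to pull the container image, then call the Docker Client to run the container.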
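The hash comparison in the per-container step can be sketched as follows. This is an illustration only: it hashes a canonical JSON form of the container spec with SHA-256, which is a stand-in chosen for this example and not the hash function kubelet uses, and the names container_hash and plan_action are hypothetical.

```python
import hashlib
import json


def container_hash(container_spec: dict) -> str:
    """Stable fingerprint of a container spec (SHA-256 used as a stand-in)."""
    canonical = json.dumps(container_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def plan_action(desired_spec: dict, running):
    """Decide what to do for one container.

    running: None if no container with that name exists, otherwise a dict
             holding the 'hash' recorded when the container was started.
    """
    if running is None:
        return "create"
    if running["hash"] != container_hash(desired_spec):
        return "recreate"  # spec changed: stop the old container, start a new one
    return "keep"


if __name__ == "__main__":
    spec = {"name": "web", "image": "nginx"}
    print(plan_action(spec, {"hash": container_hash(spec)}))
```

Serializing with sorted keys keeps the fingerprint independent of field order, so only a real change to the spec leads to a recreate.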
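Container health checks
A Pod checks the health of its containers with two kinds of probes.
LivenessProbe (liveness)
The first kind is the LivenessProbe, which determines whether a container is healthy and reports the result to kubelet. If the LivenessProbe finds the container unhealthy, kubelet kills the container and handles it according to the container's restart policy. If a container has no LivenessProbe, kubelet treats the probe's result as always being Success.
A LivenessProbe demo: a command is run inside the container, and an exit code of 0 means the container is healthy.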
```
┌──[root@vms81.liruilongs.github.io]-[~/ansible/liveness-probe]
└─$cat liveness-probe.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-liveness
  name: pod-liveness
spec:
  containers:
  - args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 10
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
    image: busybox
    imagePullPolicy: IfNotPresent
    name: pod-liveness
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
```
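ReadinessProbe (service availability)
The other kind is the ReadinessProbe, which determines whether a container has finished starting and is ready to receive requests. If the ReadinessProbe finds that the container is not ready, the Pod's status is updated and the Endpoint Controller removes the Endpoint entry containing the IP address of the container's Pod from the Service's Endpoints.
kubelet periodically invokes a container's LivenessProbe to diagnose its health. A LivenessProbe can be implemented in the following three ways.
ExecAction: runs a command inside the container. An exit status of 0 means the container is healthy.
TCPSocketAction: performs a TCP check against the container's IP address and port. If the port can be reached, the container is healthy.
HTTPGetAction: calls HTTP GET against the container's IP address, port, and path. If the response status code is greater than or equal to 200 and less than 400, the container is considered healthy.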
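The three check types above map onto three small functions. This is a local sketch using only the Python standard library: exec_probe runs the command on the local machine rather than inside a container, and all function names are illustrative, not kubelet APIs.

```python
import socket
import subprocess
import sys
import urllib.error
import urllib.request


def exec_probe(command, timeout=1.0):
    """ExecAction analogue: healthy when the command exits with status 0."""
    try:
        completed = subprocess.run(command, timeout=timeout,
                                   stdout=subprocess.DEVNULL,
                                   stderr=subprocess.DEVNULL)
        return completed.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        return False


def tcp_probe(host, port, timeout=1.0):
    """TCPSocketAction analogue: healthy when a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status_is_healthy(status):
    """HTTPGetAction rule: 200 <= status < 400 counts as success."""
    return 200 <= status < 400


def http_probe(url, timeout=1.0):
    """HTTPGetAction analogue: issue a GET and apply the status-code rule."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return http_status_is_healthy(response.status)
    except urllib.error.HTTPError as err:
        return http_status_is_healthy(err.code)
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    print(exec_probe([sys.executable, "-c", "raise SystemExit(0)"], timeout=10))
```

Each function maps timeouts and connection errors to False, matching the idea that a probe that cannot complete counts as a failure.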
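A LivenessProbe demo that uses tcpSocket. Note that it probes port 8080, while the stock nginx image listens on port 80 by default, so this probe is expected to fail and trigger restarts unless nginx is configured to listen on 8080.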
```
┌──[root@vms81.liruilongs.github.io]-[~/ansible/liveness-probe]
└─$cat liveness-probe-tcp.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: pod-livenss-probe
  name: pod-livenss-probe
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-livenss-probe
    livenessProbe:
      failureThreshold: 3
      tcpSocket:
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
```
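How failureThreshold turns individual probe failures into a restart can be sketched as a small counter. This is a simplified model: it ignores initialDelaySeconds and timing altogether and just walks through a list of probe outcomes, one per periodSeconds tick; restarts_triggered is a hypothetical helper.

```python
def restarts_triggered(probe_results, failure_threshold=3):
    """Count restarts implied by a sequence of liveness probe results.

    probe_results: iterable of booleans, one per probe period (True = success).
    A restart is counted once failure_threshold consecutive failures occur;
    the failure counter then starts again from zero for the new container.
    """
    restarts = 0
    consecutive_failures = 0
    for ok in probe_results:
        if ok:
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            restarts += 1
            consecutive_failures = 0
    return restarts


if __name__ == "__main__":
    # Two failures, one success, two failures: the threshold of 3 is never reached.
    print(restarts_triggered([False, False, True, False, False]))
```

A single success resets the counter, so only an unbroken run of failures leads to a restart.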
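For more, see my earlier post:
Notes on Pod health checks and service availability checks in Kubernetes https://blog.csdn.net/sanhewuyang/article/details/122020019
Resource monitoring
In the newer Kubernetes monitoring architecture, Metrics Server provides Core Metrics, including CPU and memory usage data for Nodes and Pods. Other Custom Metrics are collected and stored by third-party components such as Prometheus. If you're interested, see my earlier post:
Notes on Kubernetes cluster performance monitoring (kube-prometheus-stack/Metrics Server) https://liruilong.blog.csdn.net/article/details/122729697
References used in compiling this post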